We introduce a voting model and discuss the scale invariance in the mixing of candidates.
The candidates are classified into two categories $\mu \in \{0, 1\}$ and are called 'binary'
candidates. There are in total $N = N_0 + N_1$ candidates, and voters vote for them one by one. The
probability that a candidate gets a vote is proportional to its number of votes. The initial
number of votes ('seed') of a candidate $\mu$ is set to be $s_\mu$. After infinite counts of voting, the
probability function of the share of votes of the candidate $\mu$ obeys a gamma distribution with
the shape exponent $s_\mu$ in the thermodynamic limit $Z_0 = N_1 s_1 + N_0 s_0 \to \infty$. Between the
cumulative functions $\{x_\mu\}$ of binary candidates, the power-law relation $1 - x_1 \sim (1 - x_0)^{\alpha}$ with
the critical exponent $\alpha = s_1/s_0$ holds in the region $1 - x_0, 1 - x_1 \ll 1$. In the double scaling
limit $(s_1, s_0) \to (0, 0)$ and $Z_0 \to \infty$ with $s_1/s_0 = \alpha$ fixed, the relation $1 - x_1 = (1 - x_0)^{\alpha}$
holds exactly over the entire range $0 \le x_0, x_1 \le 1$. We study the data on horse races obtained
from the Japan Racing Association for the period 1986 to 2006 and confirm scale invariance.

KEYWORDS: scale invariance, voting model, branching process, gamma distribution, ROC, 
accuracy ratio 



1. Introduction 

Scale-invariant behaviour has attracted considerable attention on account of its ubiquity
in natural and man-made phenomena.1) Many candidate mechanisms that give rise
to power-law distributions have been proposed thus far. The Yule process is a widely
applicable mechanism for generating power-law distributions.2) Originally, it was proposed
to explain why the distribution of the number of species in a genus, a family, or any other
taxonomic group follows a power law.3) It has since found wide applications in other areas.1,4)

Consider the distribution of the number of species in a genus. Suppose first that new 
species appear but they never die; species are only ever added to genera and never removed. 
Species are added to genera by speciation, the splitting of one species into two. If we assume 
that this happens at some stochastically constant rate, then it follows that a genus with k 
species will gain new species at a rate proportional to k, since each of the k species has the 
same chance per unit time of dividing into two. In addition, suppose that a new species that
belongs to a new genus is added once every $m$ speciation events. So $m + 1$ new species appear
for each new genus, and there are $m + 1$ species per genus on average. Thus the number of genera goes
up steadily, as does the number of species within each genus. We denote the fraction of genera
that have $k$ species by $p_{k,n}$, where $n$ denotes the total number of genera; $n$ also measures the
passage of time in the model. At each time-step one new species founds a new genus, thereby
increasing $n$ by 1, and $m$ other species are added to various pre-existing genera, which
are selected in proportion to the number of species they already have. By solving the master
equation for $p_{k,n}$ in the limit $n \to \infty$, one finds that $p_k = \lim_{n \to \infty} p_{k,n}$ behaves as
$p_k \sim k^{-(2 + 1/m)}$. The Yule
process has been adopted and generalized to explain power laws in many other systems. An
important feature of this process is that the probability that a genus with $k$ species will gain
a new species is proportional to $k$. This 'rich-get-richer' process is the most important factor in
exhibiting power-law behaviour. The feature that $n$ increases without bound is also important in
generating power-law behaviour.
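
The growth rule described above can be sketched in a few lines of Python. This is only an illustrative simulation under our own assumptions (the function name `simulate_yule`, the starting condition of a single genus with one species, and the sequential updating of the $m$ speciation events within a time-step are choices made here, not taken from the references).

```python
import random

def simulate_yule(num_genera, m, seed=0):
    """Grow genera by the rule in the text and return the list of genus sizes."""
    rng = random.Random(seed)
    sizes = [1]     # number of species in each genus; start from one genus (assumption)
    owners = [0]    # one entry per species, recording the genus it belongs to
    while len(sizes) < num_genera:
        # m speciation events in existing genera: drawing a species uniformly
        # selects a genus with probability proportional to its size k.
        for _ in range(m):
            g = owners[rng.randrange(len(owners))]
            sizes[g] += 1
            owners.append(g)
        # one new species founds a new genus, increasing n by 1
        sizes.append(1)
        owners.append(len(sizes) - 1)
    return sizes

sizes = simulate_yule(num_genera=20000, m=2)
mean_size = sum(sizes) / len(sizes)              # close to m + 1 species per genus
frac_singletons = sizes.count(1) / len(sizes)    # fraction of genera with k = 1
largest = max(sizes)                             # a few genera become very large
```

The mean genus size is fixed by construction at about $m + 1$, while the spread of `sizes` over several orders of magnitude reflects the rich-get-richer rule.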

In this study, we introduce a voting model, a multivariate Polya-Eggenberger model,5,6)
and discuss the scale invariance in the mixing of candidates. The candidates are classified
into two categories $\mu \in \{0, 1\}$ and are called 'binary' candidates. The probability that a
candidate gets a vote is proportional to its number of votes, which is the same relation
as in the Yule process. The main difference between the voting model and the Yule process is
that the number of candidates is fixed in our model. In the Yule process, $n$ increases and in the
limit $n \to \infty$, power-law behaviour is observed. In our model, the distribution of the number
of votes does not show power-law behaviour. However, our model exhibits scale-invariant
behaviour. This behaviour is observed in the mixing of the binary candidates. Furthermore,
the power law holds over the entire range in a double scaling limit.

This kind of voting model has been introduced in the literature on social-choice problems
of preference formation in a voting population.6,7) The voting paradox, the possibility of
individual preference patterns leading to intransitivity, raises the question of how likely certain
kinds of cycles are to occur, given that people can choose at random among all possible profiles,
i.e. rankings of choices. In order for majority rule to work in a decision-making process, or to
fix the Condorcet winner, there must exist a transitive ordering among profiles. The voting
model is a simple Polya-variety urn model. A homogeneity parameter relates to measures of
similarity among voters. The model is also a rough model for contagious diseases, in which each
occurrence increases the chance of further occurrences. We can interpret the homogeneity
parameter as the contagion parameter or as the amount of similarity (homogeneity) among
voters, the extent to which voters influence one another. It was concluded that as the preference
similarity among voters increases, i.e. as the mutual influence among voters becomes stronger,
there is a smaller chance of the paradox occurring. Our conclusion is that in the ranking of the horses, the
mutual influence among voters induces the scale invariance in the mixing.

The organization of this paper is as follows. In §2, we introduce the voting model. We
select a candidate $\mu$ (initial number of votes $s_\mu$) and show that the probability density function
of the scaled share of votes, $u$, of the candidate obeys a gamma distribution function with the shape
exponent $s_\mu$ in the thermodynamic limit $Z_0 = N_1 s_1 + N_0 s_0 \to \infty$. We also show that the
joint probability density function of $u$ for any $k$ candidates is given by the direct product of
the gamma distributions in the same limit. We discuss the scale invariance in the mixing of
the binary candidates in §3. The cumulative function $1 - x_\mu$ of candidates $\mu$ is given by the
incomplete gamma function. The power-law relation $1 - x_1 \sim (1 - x_0)^{\alpha}$ with the exponent
$\alpha = s_1/s_0$ holds in the region $1 - x_0, 1 - x_1 \ll 1$. Furthermore, in the double scaling limit
$\{s_\mu\} \to 0$ and $Z_0 \to \infty$ with $\alpha = s_1/s_0$ fixed, the relation $1 - x_1 = (1 - x_0)^{\alpha}$ holds exactly
over the entire range $0 \le x_0, x_1 \le 1$. Using the data on horse races, we verify these results
in §4. We show that scale invariance holds over a wide range of cumulative functions. In
addition, we show that the probability distribution functions of $u$ are well described by gamma
distributions. Section 5 is dedicated to the summary and concluding remarks. Appendix A
is devoted to the derivation of the joint probability distribution function of $u$ for any $k$
candidates. In Appendix B, we map the voting model to a branching process and easily derive
the gamma distribution function.

2. Voting Model for Binary Candidates 



Consider a voting model for $N$ candidates. Voters vote for them one by one, and the
result of each voting is announced promptly. The time variable $t \in \{0, 1, 2, \cdots, T\}$ counts the
number of the votes. The candidates are classified into two categories $\mu \in \{0, 1\}$ and are called
binary candidates. There are $N_\mu$ candidates in each category and $N_0 + N_1 = N$. The main
result of this section is that the scaled share of votes of a candidate $\mu$ obeys a gamma
distribution with the shape exponent $s_\mu$ in the thermodynamic limit $N_0, N_1 \to \infty$.

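We denote the number of votes of the $i$th candidate of category $\mu \in \{0, 1\}$ at time $t$ as
$\{X^{\mu}_{i,t}\}_{i \in \{1, \cdots, N_\mu\}}$. At $t = 0$, $X^{\mu}_{i,t}$ takes the initial value $X^{\mu}_{i,0} = s_\mu > 0$. If the $i$th
candidate $\mu$ gets a vote at $t$, $X^{\mu}_{i,t}$ increases by one unit:

$$X^{\mu}_{i,t+1} = X^{\mu}_{i,t} + 1.$$

A voter casts a vote for one of the $N$ candidates at a rate proportional to $X^{\mu}_{i,t}$. The probability
$P^{\mu}_{i,t}$ that the $i$th candidate $\mu$ gets a vote at $t$ is

$$P^{\mu}_{i,t} = \frac{X^{\mu}_{i,t}}{Z_t}, \tag{1}$$

$$Z_t = \sum_{\mu=0}^{1} \sum_{i=1}^{N_\mu} X^{\mu}_{i,t} = N_1 s_1 + N_0 s_0 + t. \tag{2}$$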
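
The voting rule of eqs. (1) and (2) can be sketched as a direct simulation. The following is a minimal illustration, not the procedure used for the data analysis; the function name `simulate_votes` and the parameter values are hypothetical choices made here.

```python
import random

def simulate_votes(n0, n1, s0, s1, total_votes, seed=0):
    """Run the voting process and return (categories, final vote counts)."""
    rng = random.Random(seed)
    categories = [0] * n0 + [1] * n1
    votes = [s0] * n0 + [s1] * n1          # X_{i,0} = s_mu (the seeds)
    indices = range(len(votes))
    for _ in range(total_votes):
        # eq. (1): candidate i is chosen with probability X_{i,t} / Z_t
        i = rng.choices(indices, weights=votes)[0]
        votes[i] += 1                      # X_{i,t+1} = X_{i,t} + 1
    return categories, votes

n0, n1, s0, s1, T = 5, 5, 0.5, 1.5, 200    # illustrative values only
categories, votes = simulate_votes(n0, n1, s0, s1, T, seed=1)
z_T = sum(votes)                           # eq. (2): Z_T = N_1 s_1 + N_0 s_0 + T
shares = [(x - (s1 if c else s0)) / T for c, x in zip(categories, votes)]
```

Here `shares` holds the fraction of the $T$ votes obtained by each candidate, the quantity whose distribution is studied in the remainder of this section.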
The problem of determining the probability of the $i$th candidate $\mu$ getting $n$ votes up to
$T$ is equivalent to the famous Polya's urn problem.5,6,8,9) If the change in $X^{\mu}_{i,t}$ is given by

$$\Delta X^{\mu}_{i,t} \equiv X^{\mu}_{i,t+1} - X^{\mu}_{i,t},$$

the sequence $(\Delta X^{\mu}_{i,1}, \cdots, \Delta X^{\mu}_{i,T})$ is called Polya's urn sequence. This sequence is an
exchangeable stochastic process, and the joint distribution of $(\Delta X^{\mu}_{i,1}, \cdots, \Delta X^{\mu}_{i,T})$ is given by

$$\mathrm{Prob}(\Delta X^{\mu}_{i,1} = x_1, \cdots, \Delta X^{\mu}_{i,T} = x_T) = \frac{(s_\mu)_k (Z_0 - s_\mu)_{T-k}}{(Z_0)_T}.$$

Here, $k = \sum_{t=1}^{T} x_t$ and $(a)_n \equiv a \cdot (a + 1) \cdot (a + 2) \cdots (a + n - 1)$ is the rising factorial. This
distribution depends only on $k$, and not on the particular order of $(x_1, \cdots, x_T)$. This distribution
is invariant under the permutations of the entries and, hence, it is called exchangeable.
Furthermore, the expectation value of $\Delta X^{\mu}_{i,t}$, denoted by $p_\mu$, does not depend on $t$:

$$p_\mu \equiv \langle \Delta X^{\mu}_{i,t} \rangle = \frac{s_\mu}{Z_0}. \tag{3}$$

The correlation function between $\Delta X^{\mu}_{i,t}$ and $\Delta X^{\mu}_{i,t'}$ $(t' \neq t)$, denoted by $\rho_\mu$, is also constant:9)

$$\rho_\mu \equiv \mathrm{Corr}(\Delta X^{\mu}_{i,t}, \Delta X^{\mu}_{i,t'}) = \frac{\langle \Delta X^{\mu}_{i,t} \Delta X^{\mu}_{i,t'} \rangle - p_\mu^2}{p_\mu (1 - p_\mu)} = \frac{1}{Z_0 + 1}. \tag{4}$$

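The probability that the $i$th candidate $\mu$ gets $n$ votes up to $T$ is given by the beta binomial
distribution

$$\mathrm{Prob}(X^{\mu}_{i,T} - s_\mu = n) = {}_T C_n \cdot \frac{(s_\mu)_n (Z_0 - s_\mu)_{T-n}}{(Z_0)_T}. \tag{5}$$

Since $(a)_n$ is written as $(a)_n = \Gamma(a + n)/\Gamma(a)$, this relation can also be written as

$$\mathrm{Prob}(X^{\mu}_{i,T} - s_\mu = n) = {}_T C_n \cdot \frac{\Gamma(s_\mu + n)\,\Gamma(Z_0 - s_\mu + T - n)}{\Gamma(s_\mu)\,\Gamma(Z_0 - s_\mu)} \cdot \frac{\Gamma(Z_0)}{\Gamma(Z_0 + T)}. \tag{6}$$

Using the definition of the beta function $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$, we can rewrite the expression as

$$\mathrm{Prob}(X^{\mu}_{i,T} - s_\mu = n) = {}_T C_n \cdot \frac{B(s_\mu + n, Z_0 - s_\mu + T - n)}{B(s_\mu, Z_0 - s_\mu)}. \tag{7}$$

As $B(a, b)$ is also written as $B(a, b) = \int_0^1 p^{a-1} (1 - p)^{b-1} dp$, we get the next expression:

$$\mathrm{Prob}(X^{\mu}_{i,T} - s_\mu = n) = \int_0^1 {}_T C_n \cdot p^n (1 - p)^{T-n} \cdot \frac{p^{s_\mu - 1} (1 - p)^{Z_0 - s_\mu - 1}}{B(s_\mu, Z_0 - s_\mu)} \, dp. \tag{8}$$

After infinite counts of voting, i.e. $T \to \infty$, the share of votes $x^{\mu}_i \equiv \lim_{T \to \infty} (X^{\mu}_{i,T} - s_\mu)/T$ becomes
the beta distributed random variable $\mathrm{beta}(s_\mu, Z_0 - s_\mu)$ on $[0, 1]$:

$$p(x) \equiv \lim_{T \to \infty} \mathrm{Prob}(X^{\mu}_{i,T} - s_\mu = Tx) \cdot T = \frac{x^{s_\mu - 1} (1 - x)^{Z_0 - s_\mu - 1}}{B(s_\mu, Z_0 - s_\mu)}. \tag{9}$$

Here, we use the identity $\lim_{T \to \infty} {}_T C_{Tx} \, p^{Tx} (1 - p)^{T(1-x)} \cdot T = \delta(x - p)$. This result has been
derived by Polya.5)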
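
As a numerical cross-check of the beta binomial distribution, the rising-factorial form and the gamma-function form can be evaluated side by side. The sketch below uses only the Python standard library; the helper names and the parameter values ($T = 30$, $s_\mu = 1.5$, $Z_0 = 10$) are illustrative assumptions.

```python
import math

def rising(a, n):
    """Rising factorial (a)_n = a (a + 1) ... (a + n - 1)."""
    out = 1.0
    for j in range(n):
        out *= a + j
    return out

def pmf_rising(n, T, s, Z0):
    """Beta binomial probability written with rising factorials, as in eq. (5)."""
    return math.comb(T, n) * rising(s, n) * rising(Z0 - s, T - n) / rising(Z0, T)

def pmf_gamma(n, T, s, Z0):
    """The same probability written with gamma functions, as in eq. (6)."""
    log_p = (math.lgamma(s + n) + math.lgamma(Z0 - s + T - n) + math.lgamma(Z0)
             - math.lgamma(s) - math.lgamma(Z0 - s) - math.lgamma(Z0 + T))
    return math.comb(T, n) * math.exp(log_p)

T, s, Z0 = 30, 1.5, 10.0
pmf = [pmf_rising(n, T, s, Z0) for n in range(T + 1)]
total = sum(pmf)                                   # normalization
mean_votes = sum(n * p for n, p in enumerate(pmf)) # equals T * s / Z0, cf. eq. (3)
max_diff = max(abs(p - pmf_gamma(n, T, s, Z0)) for n, p in enumerate(pmf))
```

The mean number of votes $T s_\mu / Z_0$ follows from the $t$-independent expectation value $p_\mu = s_\mu / Z_0$.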

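Next, we focus on the thermodynamic limit $N_0, N_1 \to \infty$ and $Z_0 = N_0 s_0 + N_1 s_1 \to \infty$. The
expectation value of $x^{\mu}_i$ is $\langle x^{\mu}_i \rangle = p_\mu = s_\mu/Z_0$. We introduce a scaled variable
$u^{\mu}_i \equiv (Z_0 - s_\mu - 1) x^{\mu}_i$.
The distribution function $p_{s_\mu}(u)$ of the scaled variable in the thermodynamic limit follows from
eq. (9) by the change of variables $x = u/(Z_0 - s_\mu - 1)$:

$$p_{s_\mu}(u) \equiv \lim_{Z_0 \to \infty} p\left(x^{\mu}_i = \frac{u}{Z_0 - s_\mu - 1}\right) \cdot \frac{1}{Z_0 - s_\mu - 1} = \frac{1}{\Gamma(s_\mu)} e^{-u} u^{s_\mu - 1}. \tag{10}$$

The scaled share of votes, $u$, of a candidate $\mu$ obeys a gamma distribution function with the shape exponent $s_\mu$.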
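
The approach of the scaled beta density to the gamma density can be checked numerically. The following sketch compares the two at a single point for increasing $Z_0$; the evaluation point $u = 1$, the seed $s_\mu = 1.5$ and the function names are arbitrary choices made for illustration.

```python
import math

def beta_pdf(x, a, b):
    """Density of beta(a, b) at x, evaluated through logarithms for stability."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log1p(-x) - log_norm)

def scaled_share_pdf(u, s, Z0):
    """Density of u = (Z0 - s - 1) x when x follows beta(s, Z0 - s)."""
    scale = Z0 - s - 1.0
    return beta_pdf(u / scale, s, Z0 - s) / scale   # 1/scale is the Jacobian dx/du

def gamma_pdf(u, s):
    """Gamma density with shape exponent s and unit scale."""
    return math.exp(-u) * u ** (s - 1) / math.gamma(s)

s, u = 1.5, 1.0
errors = [abs(scaled_share_pdf(u, s, Z0) - gamma_pdf(u, s)) for Z0 in (1e2, 1e4, 1e6)]
```

The entries of `errors` shrink as $Z_0$ grows, in line with the thermodynamic limit taken above.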

In general, the joint probability distribution function of the scaled share of votes of $k$
different candidates becomes the direct product of $k$ gamma distribution functions in the
limit $Z_0 \to \infty$. We denote the $k$ candidates as $\{(\mu_j, i_j)\}_{j=1,\cdots,k}$ and denote the scaled share of
votes as $\{u_j\}_{j=1,\cdots,k}$. The joint probability distribution function is given as

$$p(u_1, \cdots, u_k) = \prod_{j=1}^{k} p_{s_{\mu_j}}(u_j). \tag{11}$$

The derivation of the result is given in Appendix A. It should be noted that in the
thermodynamic limit, the correlation among $\{u_j\}_{j=1,\cdots,k}$ vanishes. Hence, by mapping the voting
problem to a continuous time branching process, we can derive the gamma distribution
function $p_{s_\mu}(u)$ easily (refer to Appendix B). In the branching process, the stochastic processes of
the increase in $\{X^{\mu}_{i,t}\}$ are independent of each other.

3. Scale Invariance in Mixing of Binary Candidates 

In this section, we discuss the mixing of the binary candidates. After many counts of
voting, $T \to \infty$, the binary candidates are distributed in the space of $u$ according to the
gamma distribution in the thermodynamic limit $Z_0 \to \infty$. If $s_1 > s_0$, a candidate belonging
to category $\mu = 1$ has a higher probability of getting many votes than a candidate belonging
to category $\mu = 0$. Even so, the latter can obtain many votes, and the former
may get few votes. Thus, there is a mixing of the binary candidates. We see that scale-invariant
behaviour appears in the mixing. Between the cumulative functions of the binary candidates,
$1 - x_\mu$, the power-law relation $1 - x_1 \sim (1 - x_0)^{\alpha}$ with the exponent $\alpha = s_1/s_0$ holds.

In order to study the mixing configuration, we arrange the $N$ candidates according to the
size of $u^{\mu}_i$ as

$$u^{\mu_1}_{i_1} \ge u^{\mu_2}_{i_2} \ge \cdots \ge u^{\mu_N}_{i_N}, \quad \mu_k \in \{0, 1\}. \tag{12}$$

Using the ranking information $\{\mu_k\}_{k=1,\cdots,N}$, we draw a path $\{(x_{0,k}, x_{1,k})\}_{k=0,\cdots,N}$ in
two-dimensional space $(x_0, x_1)$ from $(x_{0,0}, x_{1,0}) = (0, 0)$ to $(x_{0,N}, x_{1,N}) = (1, 1)$ as

$$x_{\mu,k} = \frac{1}{N_\mu} \sum_{j=1}^{k} \delta_{\mu_j, \mu}. \tag{13}$$

See Fig. 1. If $\mu_k = \mu$, the path extends in the $x_\mu$ direction. The pictorial representation of the
mixing of binary objects is known as a receiver operating characteristic (ROC) curve.10) If
$s_1 \gg s_0$, the binary candidates are well separated on the axis of $u$, and the first $N_1$ candidates
belong to category $\mu = 1$ and the last $N_0$ candidates belong to category $\mu = 0$. The path goes
straight from $(0, 0)$ to $(0, 1)$ and then turns right to the end point $(1, 1)$. If $s_1 = s_0$, the path
runs almost diagonally to the end point. If $s_1 > s_0$ holds, the path resembles an upward convex
curve from $(0, 0)$ to $(1, 1)$.
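
The construction of the path from the ranking information can be sketched as follows. The two example rankings (three candidates of category 1 and five of category 0) are hypothetical, and the area under the path is included only as a convenient summary of how well the categories are separated; it is our addition for illustration.

```python
def roc_path(ranked_categories):
    """Path {(x_0k, x_1k)} built from the ranked list of categories mu_k."""
    n = [ranked_categories.count(0), ranked_categories.count(1)]
    counts = [0, 0]
    path = [(0.0, 0.0)]
    for mu in ranked_categories:
        counts[mu] += 1          # the path extends in the x_mu direction
        path.append((counts[0] / n[0], counts[1] / n[1]))
    return path

def area_under_path(path):
    """Area between the path and the x_0 axis (trapezoidal rule)."""
    area = 0.0
    for (xa, ya), (xb, yb) in zip(path, path[1:]):
        area += (xb - xa) * (ya + yb) / 2.0
    return area

separated = roc_path([1, 1, 1, 0, 0, 0, 0, 0])   # categories perfectly separated
mixed = roc_path([1, 0, 1, 0, 0, 1, 0, 0])       # a hypothetical mixed ranking
area_separated = area_under_path(separated)
area_mixed = area_under_path(mixed)
```

For the perfectly separated ranking the path runs from $(0, 0)$ to $(0, 1)$ and then to $(1, 1)$, so the area is 1; a mixed ranking gives a smaller area.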

Fig. 1. ROC curve of mixing configuration. ○ represents a candidate belonging to category $\mu = 1$.
× represents a candidate belonging to category $\mu = 0$. At the top of the figure, the order of three
candidates from category $\mu = 1$ and five candidates from category $\mu = 0$ is shown.



The distribution function of the candidate $\mu$ on the axis of $u$ is given by the gamma
distribution with the shape exponent $s_\mu$. The ROC curve $(x_0(t), x_1(t))$ with the parameter
$t \in [0, \infty]$ is given by its cumulative function as

$$x_\mu(t) \equiv \int_t^{\infty} p_{s_\mu}(u) \, du. \tag{14}$$

Using the incomplete gamma function of the first kind, $\gamma(s, t) \equiv \int_0^t e^{-u} u^{s-1} du$,11) the ROC
curve is given as

$$1 - x_\mu(t) = \frac{1}{\Gamma(s_\mu)} \gamma(s_\mu, t). \tag{15}$$

Near the end point, $(x_0, x_1) \simeq (1, 1)$, in other words, in the small $u$ region $(t \simeq 0)$, the
incomplete gamma function $\gamma(s_\mu, t)$ behaves as

$$\gamma(s_\mu, t) \sim \frac{t^{s_\mu}}{s_\mu}. \tag{16}$$

As $1 - x_\mu(t) \propto t^{s_\mu}$, the following relation holds:

$$1 - x_1 \sim (1 - x_0)^{\alpha} \quad \text{with} \quad \alpha = \frac{s_1}{s_0}. \tag{17}$$
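
The power-law relation near the end point can be checked numerically from the incomplete gamma function. The sketch below evaluates $\gamma(s, t)$ through its standard power series and estimates the local slope of $\log(1 - x_1)$ against $\log(1 - x_0)$; the seed values, the finite-difference step and the function names are illustrative assumptions.

```python
import math

def lower_incomplete_gamma(s, t, terms=200):
    """gamma(s, t) = int_0^t e^{-u} u^{s-1} du, computed from the series
    t^s e^{-t} sum_n t^n / (s (s + 1) ... (s + n))."""
    term = 1.0 / s
    total = term
    for n in range(1, terms):
        term *= t / (s + n)
        total += term
    return t ** s * math.exp(-t) * total

def one_minus_x(s, t):
    """Cumulative function 1 - x_mu(t) = gamma(s_mu, t) / Gamma(s_mu)."""
    return lower_incomplete_gamma(s, t) / math.gamma(s)

def local_exponent(s0, s1, t, ratio=1.01):
    """Slope of log(1 - x_1) versus log(1 - x_0) between t and ratio * t."""
    d1 = math.log(one_minus_x(s1, ratio * t)) - math.log(one_minus_x(s1, t))
    d0 = math.log(one_minus_x(s0, ratio * t)) - math.log(one_minus_x(s0, t))
    return d1 / d0

s0, s1 = 0.5, 1.5                      # illustrative seeds, alpha = 3
slopes = [local_exponent(s0, s1, t) for t in (1e-4, 1e-2, 1.0)]
```

For small $t$ the slope approaches $s_1/s_0$, whereas for $t$ of order one it departs from this value, which is why the relation is stated only for the region $1 - x_0, 1 - x_1 \ll 1$.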

The density of good candidates, $p_1$, in terms of the cumulative function of bad candidates,
$1 - x_0$, is given as

$$p_1 \equiv \frac{d(1 - x_1)}{d(1 - x_0)} \sim (1 - x_0)^{\alpha - 1}. \tag{18}$$

$p_1$ obeys the power law with the exponent $\alpha - 1$.

Furthermore, in the limit $(s_1, s_0) \to (0, 0)$ with $\alpha = s_1/s_0$ fixed, the relation
$1 - x_1 = (1 - x_0)^{\alpha}$ holds. The proof is given as follows. $\gamma(s, t)$ is expressed using Kummer's confluent
hypergeometric function $M(a, b, t)$ 11) as

$$\gamma(s, t) = \frac{1}{s} t^{s} \cdot M(s, s + 1, -t). \tag{19}$$

The cumulative function $1 - x_\mu(t)$ is then given as

$$1 - x_\mu(t) = \frac{t^{s_\mu}}{\Gamma(s_\mu + 1)} \cdot M(s_\mu, s_\mu + 1, -t). \tag{20}$$

Eliminating $t$ between the expressions for $\mu = 0$ and $\mu = 1$ by means of $t^{s_1} = (t^{s_0})^{\alpha}$, we obtain

$$1 - x_1 = (1 - x_0)^{\alpha} \cdot \frac{\Gamma(s_0 + 1)^{\alpha}}{\Gamma(s_1 + 1)} \cdot \frac{M(s_1, s_1 + 1, -t)}{M(s_0, s_0 + 1, -t)^{\alpha}}. \tag{21}$$

In the limit $s_\mu \to 0$, both $\Gamma(s_\mu + 1)$ and $M(s_\mu, s_\mu + 1, -t)$ become equal to 1 and the following
relation holds:

$$1 - x_1 = (1 - x_0)^{\alpha}, \quad 0 \le x_0, x_1 \le 1. \tag{22}$$

Thus, the scale-invariant relation holds over the entire range $0 \le x_0, x_1 \le 1$. This feature is
remarkable from the viewpoint of statistical physics. Usually, a power-law relation holds
only in the tail.
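
The approach to the exact relation can be illustrated numerically by shrinking the seeds at fixed ratio. The sketch below measures the largest deviation $|(1 - x_1) - (1 - x_0)^{\alpha}|$ on a grid of the parameter $t$; the choice $\alpha = 2$, the grid and the function names are assumptions made for this example only.

```python
import math

def one_minus_x(s, t, terms=400):
    """1 - x(t) = gamma(s, t) / Gamma(s), using the power series of gamma(s, t)."""
    term = 1.0 / s
    total = term
    for n in range(1, terms):
        term *= t / (s + n)
        total += term
    return t ** s * math.exp(-t) * total / math.gamma(s)

def max_deviation(s0, alpha, ts):
    """Largest |(1 - x_1) - (1 - x_0)^alpha| over the parameter values ts."""
    s1 = alpha * s0
    return max(abs(one_minus_x(s1, t) - one_minus_x(s0, t) ** alpha) for t in ts)

alpha = 2.0
ts = [10.0 ** (k / 4.0) for k in range(-40, 5)]   # t from 1e-10 to 10
deviations = [max_deviation(s0, alpha, ts) for s0 in (1.0, 0.1, 0.01)]
```

The deviation is sizeable for $s_0 = 1$ and decreases as $s_0 \to 0$ with $\alpha$ fixed, consistent with the limiting argument above.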

The relative probability that a candidate gets the first vote $(t = 0)$ is given by $s_\mu$. If the
candidate gets the first vote, his/her score increases by 1 and the relative probability becomes
$s_\mu + 1$. In the limit $s_\mu \to 0$, the additional score $+1$, or the weight of a single vote, becomes
crucially important. The probability that the candidate gets the next vote becomes equal to
1, which is exemplified by the behaviour of $\rho_\mu$, given by eq. (4):

$$\rho_\mu = \frac{1}{Z_0 + 1} = \frac{1}{N_0 s_0 + N_1 s_1 + 1} \to 1 \quad \text{if } s_\mu \to 0. \tag{23}$$

After infinite counts of voting, the candidate occupies the first position in the order of
candidates according to the number of votes. Then, we neglect this candidate in the voting
problem and consider the remaining $N - 1$ candidates. Similarly, if a candidate is selected
randomly with the relative probability $s_\mu$, he/she occupies the second position. Thus, the
voting problem reduces to a random choice problem with the relative probability $s_\mu$ in the
limit $\{s_\mu\} \to 0$. At $(x_0, x_1)$ on the ROC curve, the probability that the next candidate belongs
to category $\mu$ is proportional to $(1 - x_\mu) s_\mu$. The coordinates of the ROC curve $(x_0, x_1)$ grow
according to the following relation:

$$dx_\mu \propto (1 - x_\mu) \cdot s_\mu.$$






J. Phys. Soc. Jpn. Full Paper 

Solving this relation, we get eq.(22). 
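
The random choice picture can be illustrated by a small simulation in which the ranking is built one position at a time. This is a sketch under our own assumptions: the next candidate is drawn from the remaining candidates with weight $s_\mu$ each, so that a category is chosen with probability proportional to (number remaining) $\times\, s_\mu$, and the sizes $N_0 = N_1 = 20000$ and seeds are arbitrary.

```python
import random

def random_choice_ranking(n0, n1, s0, s1, seed=0):
    """Fill the ranking from the top by drawing among the remaining candidates
    with relative probability s_mu per candidate."""
    rng = random.Random(seed)
    remaining = [n0, n1]
    ranking = []
    while remaining[0] + remaining[1] > 0:
        w0 = remaining[0] * s0
        w1 = remaining[1] * s1
        mu = 1 if rng.random() * (w0 + w1) < w1 else 0
        remaining[mu] -= 1
        ranking.append(mu)
    return ranking

def max_gap_from_power_law(ranking, n0, n1, alpha):
    """Largest |(1 - x_1) - (1 - x_0)^alpha| along the ROC path of the ranking."""
    picked = [0, 0]
    gap = 0.0
    for mu in ranking:
        picked[mu] += 1
        x0, x1 = picked[0] / n0, picked[1] / n1
        gap = max(gap, abs((1.0 - x1) - (1.0 - x0) ** alpha))
    return gap

n0, n1, s0, s1 = 20000, 20000, 1.0, 2.0
ranking = random_choice_ranking(n0, n1, s0, s1)
gap = max_gap_from_power_law(ranking, n0, n1, alpha=s1 / s0)
```

Only the ratio $s_1/s_0$ enters this procedure, and for large $N_\mu$ the resulting path stays close to $1 - x_1 = (1 - x_0)^{\alpha}$ over the whole range, up to statistical fluctuations.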

Finally, we discuss the limit in the derivation of the exact scale invariance. In the derivation
of the gamma distribution, we take the thermodynamic limit $Z_0 = N_1 s_1 + N_0 s_0 \to \infty$. With
the gamma distribution, eq. (22) holds in the limit $\{s_\mu\} \to 0$. In order that eq. (22) holds, these
two limits, $Z_0 \to \infty$ and $\{s_\mu\} \to 0$, should go together: $\{s_\mu\}$ must approach zero more slowly than
$\{N_\mu\}$ approaches infinity. In other words, in the double scaling limit $Z_0 \to \infty$ and $\{s_\mu\} \to 0$
with $\alpha = s_1/s_0$ fixed, eq. (22) holds. So the above intuitive explanation of the exact scale
invariance may be too naive. If we take the limit $\{s_\mu\} \to 0$ without the limit $Z_0 \to \infty$, $\rho_\mu$
becomes 1. The first chosen candidate then gets all the remaining votes and no mixing of the
binary candidates occurs. The double scaling limit is crucial in the emergence of
the exact scale invariance.

4. Data Analysis of Horse Races 

We verify the results of the voting model, particularly the scale invariance in the mixing
of binary candidates. We study all the data on horse race betting obtained from the Japan
Racing Association (JRA) for the period 1986 to 2006. There were 71549 races, in
which a total of 901366 horses participated. We select the winning horses as candidates
belonging to category $\mu = 1$. For candidates belonging to category $\mu = 0$, we consider two cases:
losing horses and horses finishing second. In a race, no one knows which horse will win. Betters
only have partial information on the horses, which is embedded in the initial values $\{s_\mu\}$. The
results of betting are announced at short intervals. Betters usually presume that the horses
which get many votes are strong. They come to know which horses are considered to be strong
by other betters. These features are incorporated in the voting model. Betters do not always
bet on strong horses. Some betters may prefer betting on a horse that would pay out more money
even if it is considered to be 'weaker' than a horse that would pay out less. However, in the bet
to win, only the betters who bet on the winning horse win the bet. Hence, the assumption is
not so unrealistic. We also note the reason why we can treat multiple categories, second-finishing
horses and losing horses, as the category $\mu = 0$. For the betters, the only difference between the
losing horses and the second-finishing ones is their confidence. By tuning the parameter $s_0$, we can
treat the two categories on the same footing.

Next, we explain the meaning of the initial values $\{s_\mu\}$. The probability that a candidate
$\mu$ is selected is proportional to $s_\mu$, as $\langle \Delta X^{\mu}_{i,t} \rangle = s_\mu/Z_0$. The ratio $s_1/s_0$ is a measure of the
accuracy of the knowledge of betters. On the other hand, $\rho_\mu$ is given by eq. (4). If the scale of
$\{s_\mu\}$ is small, the decisions of betters are crucially affected by the choices of other betters. In
the limit $\{s_\mu\} \to \infty$, their decisions are not affected by the choices of other betters. The scale
of $\{s_\mu\}$ is a measure of the degree of similarity ('copycat') of the choices of betters.

In the early stage of voting, $\{s_\mu\}$ is the only available information. Voters decide on horses
on the basis of $\{s_\mu\}$ and they are 'intelligent', because their decisions are not affected by the
choices of other betters. As the voting process proceeds, the importance of the cumulative
number of votes exceeds that of the initial scores, and voters become 'copycat'. If one controls
the scale of $\{s_\mu\}$ (or the weight of a single vote), the timing of the passage from the initial 'intelligent'
stage to the late 'copycat' stage should change.



Table I. Data on horse race betting obtained from the Japan Racing Association (JRA) for the period
1986 to 2006. There are 71549 races and 71650 winning (finishing first) horses; 71590 horses
finished second. The difference between $N_{\mathrm{Win}}$ and $N_{\mathrm{2nd}}$ indicates that ties occur in the
races. The third column shows the average value of the share of the votes in each category.
The fourth column shows the values $\bar{v}_\nu / c$, where $c$ is the estimated value of the scale parameter in
(24). For the estimation of $c$, see the main text and Figure 2.

Category $\nu$    $N_\nu$     $\bar{v}_\nu$ [%]    $\bar{v}_\nu / c$

Win               71650       21.23                1.769

2nd               71590       15.40                1.283

Lose              829716      6.80                 0.567



We denote the three categories of horses as $\nu \in \{\mathrm{Win}, \mathrm{2nd}, \mathrm{Lose}\}$ and the number of
horses in each category as $N_\nu$. $v^{\nu}_i$ denotes the share of votes of the $i$th horse in the category
$\nu$, and $\bar{v}_\nu$ denotes the average value of $v^{\nu}_i$. In Table I, we summarize the data on horse races.
The difference between $N_{\mathrm{Win}}$ and $N_{\mathrm{2nd}}$ indicates that ties occur in the races.

We have shown that the scaled share of votes, $u$, obeys a gamma distribution function with $s_\mu$.
In order to check whether $v^{\nu}_i$ obeys a gamma distribution function, we have to set the scale $c$
between $v^{\nu}_i$ and $u$ as follows:

$$v^{\nu}_i = c \cdot u. \tag{24}$$

The same $c$ should be used for all categories. Assuming that $u$ obeys the gamma probability
distribution with $s_\nu$, $v^{\nu}_i$ obeys the following probability distribution function:

$$p(v^{\nu}_i = v) = p_{s_\nu}(v) = \frac{1}{c \cdot \Gamma(s_\nu)} \left(\frac{v}{c}\right)^{s_\nu - 1} \exp\left(-\frac{v}{c}\right).$$

The expectation value of $v^{\nu}_i$ is

$$\langle v^{\nu}_i \rangle = \int_0^{\infty} p_{s_\nu}(v) \, v \, dv = c \cdot s_\nu.$$

If we set $c$, it is possible to estimate $s_\nu$ of the horses in category $\nu$ as $s_\nu = \bar{v}_\nu / c$.

Figure 2 shows the probability distribution functions $p(v)$ of $v^{\nu}_i$. In the same figure, we
show the result of fitting with the gamma probability functions. Using the least square method
in the range $v \in [0.01, 1.0]$, we set $c = 0.12$ and $s_{\mathrm{Win}} = 1.659$, $s_{\mathrm{2nd}} = 1.258$ and $s_{\mathrm{Lose}} = 0.529$.
Comparing with the values in the fourth column in Table I, we observe that the values of
$s_\nu$ and $\bar{v}_\nu / c$ are close to each other in all categories, implying that the bulk shapes of the
probability functions of $v^{\nu}_i$ are well described by the gamma distributions. We also notice
clear discrepancies in the figure: $p(v)$ does not obey the gamma distribution for the larger
shares. The bulk shape of $p(v)$ is not crucial in our argument, because we are interested in the
critical properties, i.e. the regime of small win bet fractions. We think the discrepancies arise
because the voters' confidence $s_\mu$ has some variance.

Fig. 2. Logarithmic plot of probability distribution functions $p(v)$ of shares of votes. The curves from
the top to bottom indicate the data for $\nu =$ Win (solid), 2nd (dashed) and Lose (dotted). The
gamma distribution functions with $c = 0.12$ and $s_\nu$ are also plotted (chain lines).
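
The estimate $s_\nu = \bar{v}_\nu / c$ can be reproduced directly from the average shares in Table I and the scale $c = 0.12$ quoted in the text. The snippet below is only a worked arithmetic check; the dictionary layout and function name are our own.

```python
# Average shares of votes per category, in percent (third column of Table I).
mean_share_percent = {"Win": 21.23, "2nd": 15.40, "Lose": 6.80}
c = 0.12   # scale between the share v and the scaled variable u, eq. (24)

def estimate_shape(mean_share, scale):
    """Shape exponent from <v> = c * s, i.e. s = <v> / c."""
    return mean_share / scale

estimates = {
    category: round(estimate_shape(percent / 100.0, c), 3)
    for category, percent in mean_share_percent.items()
}
```

The resulting values coincide with the fourth column of Table I (1.769, 1.283 and 0.567) and can be compared with the fitted values $s_{\mathrm{Win}} = 1.659$, $s_{\mathrm{2nd}} = 1.258$ and $s_{\mathrm{Lose}} = 0.529$.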

We study the cumulative functions $1 - x_\nu$ in the small share region, $v \to 0$. Figure 3 shows
the cumulative functions $D(v)$ of $v^{\nu}_i$, which are defined as

$$D(v) \equiv \int_0^{v} p(v') \, dv'. \tag{25}$$

We are interested in the power-law behaviour $D(v) \propto v^{s_\nu}$, and the figure shows the double
logarithmic plot. We see that the data do not obey the pure power law predicted by
eq. (20). In the figure, we show the result of fitting with $(v - v_c)^{s'_\nu}$. We set $v_c = 0.0014$
and the figure shows that $D(v)$ for the winning and second-finishing horses obeys the power law
with cut-off $v_c$. On the other hand, for the losing horses, the fitting only works in the
region $D(v) > D_c = 0.003$ and $v > v_c$. The reason why $D(v)$ does not obey the pure power law is
not clear. We think that there are some voters who want to vote for the horses with remarkably
small shares. The odds are very large and, for these voters, the horses look very attractive. If
so, we can understand the existence of the cut-off $v_c$.

We study the mixing properties of the binary horses by employing the method explained
in the text. We adopt the Win-Lose pair and Win-2nd pair as the binary pairs. Figure 4 shows
the double-logarithmic plot of the ROC curve $(1 - x_0, 1 - x_1)$ for the two pairs. The plots
show scale-invariant behaviour over a wide range of $1 - x_1$. In the case of the Win-2nd pair,
scale invariance holds over the range $10^{-5} \le 1 - x_1 \le 10^{-1}$, which can be anticipated from the
behaviour of $D(v)$. For the Win-Lose pair, the range is restricted to $1 - x_0 > D_c = 0.003$.
In order to see the scale invariance for the region $1 - x_0 < D_c$, many more results of the races
($N_{\mathrm{Win}} \sim 10^6$) are necessary. Using the least square method in the range $0 \le 1 - x_0 \le 0.1$, we
estimate the critical exponent $\alpha$. The values of the parameters and other data are summarized
in Table II. The estimated values of $\alpha$ are considerably different from those predicted by the
model from the bulk values, i.e. $s_1/s_0$. In the table, we also show the values $s'_\mu$ estimated by fitting $D(v)$ with
$(v - v_c)^{s'_\nu}$. The estimated values of $\alpha$ are closer to the ratios $s'_1/s'_0$ than to the ratios of the bulk values $s_\mu$.

Fig. 3. Double-logarithmic plot of cumulative distribution functions $D(v)$ of shares of votes. The
curves from the top to bottom indicate the data for $\nu =$ Lose, 2nd and Win. The fitted functions
$(v - v_c)^{s'_\nu}$ are also plotted. We set $v_c = 0.0014$.

Fig. 4. Double-logarithmic plot of ROC curves $(1 - x_0, 1 - x_1)$. The curves of the Win-Lose pair (solid
line) and the Win-2nd pair (dashed line) are plotted. The fitting curves given by $1 - x_1 = a \cdot (1 - x_0)^{\alpha}$
(dash-dotted line) are also plotted.

Table II. The initial value $s_\mu$ in each category and the critical exponent $\alpha$ are listed. In the last
two columns, we show the values of $\alpha$ predicted by the voting model.

Pair          $s_1$    $s_0$    $s'_1$   $s'_0$   $\alpha$   $s_1/s_0$   $s'_1/s'_0$

Win vs Lose   1.659    0.529    2.03     1.15     1.81       3.134       1.765
Win vs 2nd    1.659    1.258    2.03     1.86     1.12       1.318       1.091

5. Concluding Remarks 

In this study, we have introduced a simple voting model in order to discuss the mixing
of binary candidates with initial numbers of votes $s_0$ and $s_1$. As the voting process proceeds,
the candidates are mixed in the space of the scaled share of votes, $u$. We have shown that the
probability distribution of $u$ of a candidate $\mu$ obeys a gamma distribution function with the
shape exponent $s_\mu$ in the thermodynamic limit $Z_0 \to \infty$. The joint probability distribution of
$k$ different candidates is given as the direct product of the gamma distributions. The mixing
configuration of the binary candidates exhibits scale invariance in the small $u$ region. In
particular, in the double scaling limit $Z_0 \to \infty$ and $\{s_\mu\} \to 0$ with $\alpha = s_1/s_0$ fixed, the scale
invariance holds over the entire range. The cumulative function of the binary candidates obeys

$$1 - x_1 = (1 - x_0)^{\alpha} \quad \text{for } 0 \le x_0, x_1 \le 1.$$

The data on horse races obtained from JRA also show that scale invariance holds over a
wide range of cumulative functions. The distribution functions of the share of votes, $u$, are to
some extent described by the gamma distribution functions, implying that the behaviour of
betters is described by the voting model. However, a clear discrepancy is observed in the critical
behaviour. The bulk properties of the probability function $p(v)$ and the critical properties of
the cumulative functions $D(v)$ should be discussed separately. Although our voting model
describes the mechanism of scale invariance in the mixing of binary candidates, it may be
too simple to describe the behaviour of betters in real cases. Thus far, dividends have been
reported to exhibit power-law behaviour. Another betting model has been proposed.12,13)
A detailed study of real data, in particular the time series of the number of votes, should
clarify the mechanism of scale invariance in betting systems.14)

We also note that our model is related to the random Young diagram problem.15) This
problem pertains to the probabilistic growth of a Young diagram. A parabolic shape16) and a
quadrant shape17) have been obtained for the asymptotic shape. The complementary part of
the ROC curve, which is embedded in the fourth quadrant, corresponds to the Young diagram.
In our model, the ROC curve $(x_0(t), x_1(t))$ given by (15) describes the asymptotic shape of the
Young diagram. In particular, it is described by the relation $1 - x_1 = (1 - x_0)^{\alpha}$ in the double
scaling limit. Figure 5 shows the correspondence between the voting model and the random
Young diagram problem. As the voting process proceeds, the order of the binary candidates
and the Young diagram change.

Fig. 5. Voting model and random Young diagram model. As the voting process proceeds, the order
of the binary candidates and the Young diagram change. The complementary space of the ROC
curve corresponds to the Young diagram.

It is also possible to study the voting model with many categories of candidates by
using many different initial values $\{s_\mu\}$. The scaled share $u$ of the candidates in each category becomes a
gamma distributed random variable. Scale invariance holds between any pair of categories.
Figure 6 shows the triple-logarithmic plot of the cumulative functions of the winning ($x_{\mathrm{1st}}$),
finishing second ($x_{\mathrm{2nd}}$) and finishing third ($x_{\mathrm{3rd}}$) horses. In the linear part of the curve, scale
invariance holds between any pair of categories.

Acknowledgment 

We thank Dr. Emmanuel Guitter for his useful discussions on the branching process. We
also thank Takumi Nakaso for helping us with the data analysis of horse races. This work was
supported by Grant-in-Aid for Challenging Exploratory Research 21654054.






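Fig. 6. Triple-logarithmic plot of ROC curve $(1 - x_{\mathrm{1st}}, 1 - x_{\mathrm{2nd}}, 1 - x_{\mathrm{3rd}})$ for the horses finishing
1st, 2nd and 3rd. $x_\nu$ denotes the cumulative function of the horses in category $\nu$.
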
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Appendix A: Joint probability distribution function 

We start from the expression of the joint probability function given by 
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Here, s flk+1 



Zq — Ylj=i s tJ.j an d n-k+i = T — Y^j=i n j- Using the Dirichlet distribution 
function, we can rewrite the expression as 
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The expectation value of AA"^ = X^ J t+1 — X^ t is given by 



*» =< ^ >= f 



(A-2) 
(A-3) 
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The correlation between AX^ ( and AX- lk t (k ^ j) is given as 
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By changing the integral variables from {pi}i=i t ... ,& to {/ii}i=i,... ,& as = (1 — 5Zt=i Pj)^ 1 



n}=i(l ~~ hj)hi, we obtain 
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We focus on the share of votes of candidates in the limit T — > oo. We introduce as 



(T — X^j=i n i)y« = ^ll}=i(l ~~ Vj)Vi an d define the joint distribution function as 

/'Si.'/,}, I- /,) = EmProb.({^ T - 8 N = r JJ(1 - ///)//,}, ,....) • II( r " E^)- ( A - 6 ) 

;=i i=i j=i 

The joint function P ({yj}j=i, - ,k) is given by 

r(z ) Ar i 
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(A-7) 



We introduce the variable Xj as Xj = (1 — X^=i x j)Vii which is related to rij as ni = T ■ x 
The joint probability function P({xj}j = i^... k) is then given as 

k r -i fe 



P({^}y=i,..., fc ) 
Finally, we introduce the variables $\{u_i\}$ as $u_i = (s_{\mu_{k+1}} - 1)\, x_i$. In the thermodynamic limit
$Z_0, s_{\mu_{k+1}} \to \infty$, we obtain

$$P(\{u_j\}_{j=1,\dots,k}) = \prod_{j=1}^{k} p_{s_{\mu_j}}(u_j). \tag{A-9}$$
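The limit described in this appendix can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes that the vote shares in the limit $T \to \infty$ are Dirichlet distributed with the seeds as parameters, lumps all untagged candidates into one remaining component, and rescales the tagged shares by $s_{\mu_{k+1}} - 1$. The function name `scaled_shares` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def scaled_shares(seeds, z0, n_samples, rng):
    """Sample limiting vote shares and rescale them as u_i = (s_rest - 1) * x_i."""
    seeds = np.asarray(seeds, dtype=float)
    s_rest = z0 - seeds.sum()            # combined seed of all untagged candidates
    alpha = np.append(seeds, s_rest)     # Dirichlet parameters: tagged seeds + rest
    x = rng.dirichlet(alpha, size=n_samples)[:, :-1]   # shares of tagged candidates
    return (s_rest - 1.0) * x

rng = np.random.default_rng(0)
# Assumed illustration values: two tagged candidates with seeds 2 and 1, Z_0 = 2000.
u = scaled_shares(seeds=[2.0, 1.0], z0=2000.0, n_samples=20000, rng=rng)
mean_u = u.mean(axis=0)                          # a gamma variable with shape s has mean s
corr_u = np.corrcoef(u[:, 0], u[:, 1])[0, 1]     # should be small for large Z_0
print(mean_u, corr_u)
```

For large $Z_0$ the sample means should lie close to the seeds and the sample correlation close to zero, which is what independent gamma variables would give.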
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Appendix B: Continuous time branching process 

We translate the discrete time voting problem $\{X^{\mu}_{i,t}\}_{i=1,\dots,N_\mu}$ to a continuous time branching
process $\{X^{\mu}_{i}(t)\}_{i=1,\dots,N_\mu}$,18) because the latter is more tractable than the former.19) Figure
B-1 shows the mapping process. Let $X^{\mu}_{i}(t)$ denote the number of offspring of individuals.
Each individual is substituted by two offspring at its death (branching), and the probability
that an individual dies during time $dt$ is given by $dt$. The number of offspring of each individual
is denoted as $\{x^{\mu}_{i,k}(t)\}_{k=1,\dots,s_\mu}$.

$$X^{\mu}_{i}(t) = \sum_{k=1}^{s_\mu} x^{\mu}_{i,k}(t). \tag{B-1}$$
The substitution of an individual by two offspring corresponds to the process of getting a
vote. The frequency of deaths, or the probability of getting another vote, is proportional to
$X^{\mu}_{i}(t)$. This relation is the same as that in the discrete time voting model. The count of
votes, $t$, corresponds to the count of branchings. If branching takes place $t$ times up to $t'$,
the following relation holds.

$$X^{\mu}_{i}(t') = X^{\mu}_{i,t}.$$
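The discrete side of this correspondence can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the authors' code: each vote goes to a candidate with probability proportional to its current number of votes, which is also how the next branching event is distributed when every individual splits at the same rate. The function name `simulate_votes`, the seed list and the number of votes are hypothetical.

```python
import numpy as np

def simulate_votes(seeds, n_votes, rng):
    """Discrete-time voting: each vote goes to candidate i with probability X_i / sum(X)."""
    votes = np.asarray(seeds, dtype=float).copy()   # initial votes equal the seeds
    for _ in range(n_votes):
        winner = rng.choice(len(votes), p=votes / votes.sum())
        votes[winner] += 1.0                        # one branching event = one vote
    return votes

rng = np.random.default_rng(1)
seeds = [2, 2, 1, 1]      # assumed example: N_1 = N_0 = 2 with s_1 = 2, s_0 = 1
final = simulate_votes(seeds, n_votes=1000, rng=rng)
shares = final / final.sum()
print(final, shares)
```

After $t$ votes the total count is $Z_0 + t$, matching the total number of individuals after $t$ branchings.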






Fig. B-1. Mapping of the voting model to the branching process. The left-hand figure shows a voting
process with $N_1 = N_0 = 2$. ○ represents a candidate belonging to category $\mu = 1$ ($s_1 = 2$) and ×
represents a candidate belonging to category $\mu = 0$ ($s_0 = 1$). The right-hand figure shows the
corresponding branching process. • represents the initial individuals and offspring. A candidate
belonging to category $\mu = 1$ (0) is composed of two individuals (one individual).

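The expectation values $\langle x^{\mu}_{i,k}(t) \rangle$ and $\langle X^{\mu}_{i}(t) \rangle$ increase with $e^{t}$. Next, we introduce
the scaled variables $U^{\mu}_{i}(t)$ and $u^{\mu}_{i,k}(t)$ as

$$U^{\mu}_{i}(t) = e^{-t} X^{\mu}_{i}(t) \quad \text{and} \quad u^{\mu}_{i,k}(t) = e^{-t} x^{\mu}_{i,k}(t). \tag{B-2}$$

We focus on the following probability distributions:

$$p_{s_\mu}(u)\, du = \lim_{t \to \infty} \mathrm{Prob}(u < U^{\mu}_{i}(t) < u + du), \tag{B-3}$$

$$p(u)\, du = \lim_{t \to \infty} \mathrm{Prob}(u < u^{\mu}_{i,k}(t) < u + du). \tag{B-4}$$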

In order to obtain $p(u)$, we consider the situation in which an individual splits at $t = \tau$ for
the first time. The resulting two offspring continue the branching process. The scaled number
of offspring of the individual is denoted as $u$. Those of the two offspring are denoted as $u_1$ and
$u_2$. Figure B-2 gives a pictorial representation of the relation among $u$, $u_1$ and $u_2$. We observe






Fig. B-2. Pictorial representation of the self-consistent relation among $u$, $u_1$ and $u_2$. An individual splits
at $t = \tau$ for the first time, producing two offspring. Because of the time lag $\tau$, the relation
$u = (u_1 + u_2)\, e^{-\tau}$ holds.



that these variables satisfy the following relation:

$$u = (u_1 + u_2)\, e^{-\tau}.$$

Furthermore, $u_1$ and $u_2$ obey the same probability distribution as that obeyed by $u$, and the
probability that an individual splits for the first time during $\tau < t < \tau + d\tau$ is $e^{-\tau} d\tau$. Thus,
we obtain

$$p(u) = \int_0^\infty e^{-\tau} d\tau \int_0^\infty du_1 \int_0^\infty du_2\, p(u_1)\, p(u_2)\, \delta\bigl(u - (u_1 + u_2)\, e^{-\tau}\bigr). \tag{B-5}$$



Introducing $X = e^{-\tau}$, the relation is rewritten as

$$p(u) = \int_0^1 dX \int_0^\infty du_1 \int_0^\infty du_2\, p(u_1)\, p(u_2)\, \delta\bigl(u - (u_1 + u_2)\, X\bigr). \tag{B-6}$$

Using the Laplace transform of $p(u)$, $\hat{p}(s) = \int_0^\infty p(u)\, e^{-su}\, du$, it can be shown that $\hat{p}(s)$ satisfies
the following integral equation:

$$\hat{p}(s) = \frac{1}{s} \int_0^s \hat{p}(v)^2\, dv. \tag{B-7}$$
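The step from (B-6) to (B-7) can be spelled out as follows. This is a sketch of one way to carry it out, assuming that the new variable $X = e^{-\tau}$ runs over $(0, 1)$ and writing the Laplace transform with a hat; the substitution $v = sX$ is introduced here for illustration.

```latex
\begin{align*}
\hat{p}(s)
  &= \int_0^\infty du\, e^{-su} \int_0^1 dX \int_0^\infty du_1 \int_0^\infty du_2\,
     p(u_1)\, p(u_2)\, \delta\bigl(u - (u_1 + u_2) X\bigr) \\
  &= \int_0^1 dX \int_0^\infty du_1\, p(u_1)\, e^{-sX u_1}
     \int_0^\infty du_2\, p(u_2)\, e^{-sX u_2} \\
  &= \int_0^1 dX\, \hat{p}(sX)^2
   = \frac{1}{s} \int_0^s \hat{p}(v)^2\, dv \qquad (v = sX).
\end{align*}
```

Multiplying by $s$ and differentiating then gives $\hat{p}(s) + s\, d\hat{p}(s)/ds = \hat{p}(s)^2$.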



Differentiating (B-7) with respect to $s$, we obtain the following differential equation:

$$s\, \frac{d\hat{p}(s)}{ds} = \hat{p}^2(s) - \hat{p}(s). \tag{B-8}$$

(B-8) can be solved easily to obtain

$$\hat{p}(s) = \frac{1}{1 + as}. \tag{B-9}$$
Using the normalization condition $\langle u \rangle = 1$ and the inverse Laplace transform, we get

$$p(u) = e^{-u}. \tag{B-10}$$
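The exponential solution can be checked against the self-consistent relation by direct sampling. The sketch below is an illustrative check, not part of the original derivation: it draws the first splitting time with density $e^{-\tau}$, draws the two scaled offspring numbers from the trial distribution $e^{-u}$, and verifies that $(u_1 + u_2)\, e^{-\tau}$ has the moments and tail of the same distribution. The sample size and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
tau = rng.exponential(1.0, size=n)    # first splitting time, density e^{-tau}
u1 = rng.exponential(1.0, size=n)     # scaled offspring number, trial p(u) = e^{-u}
u2 = rng.exponential(1.0, size=n)
u = (u1 + u2) * np.exp(-tau)          # self-consistent relation u = (u1 + u2) e^{-tau}

mean_u = u.mean()                     # e^{-u} has mean 1
second_moment = (u ** 2).mean()       # e^{-u} has second moment 2
tail = (u > 1.0).mean()               # e^{-u} gives P(u > 1) = e^{-1}
print(mean_u, second_moment, tail)
```

Agreement of these three statistics with 1, 2 and $e^{-1}$ is consistent with the distribution being a fixed point of the relation, though it is not a proof.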



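We obtain $p_{s_\mu}(u)$ by convolution as

$$p_{s_\mu}(u) = \prod_{i=1}^{s_\mu} \int_0^\infty du_i\, p(u_i)\; \delta\Bigl(u - \sum_{i=1}^{s_\mu} u_i\Bigr) = \frac{1}{\Gamma(s_\mu)}\, u^{s_\mu - 1} e^{-u}. \tag{B-11}$$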



$U^{\mu}_{i}$ obeys the gamma distribution with the shape exponent $s_\mu$ given by (10). We note that the
result (10) is derived in the thermodynamic limit, where the correlation among $\{u_j\}_{j=1,\dots,k}$
vanishes. On the other hand, in the continuous time branching process, the splitting processes
of each individual and offspring are independent of each other. As a result, we obtain the
gamma distribution which appears in the voting model in the thermodynamic limit.
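The branching-process result can also be illustrated by simulation. The following is a minimal sketch under stated assumptions: every individual splits at rate 1, so with $n$ individuals present the waiting time to the next branching is exponential with mean $1/n$, and the scaled population $e^{-t} X(t)$ is recorded at a finite time $t$. The function name `scaled_population`, the cap `max_events` on the number of branchings, and the parameter values are hypothetical.

```python
import numpy as np

def scaled_population(s, t, n_runs, max_events, rng):
    """Return U = e^{-t} X(t) for a rate-1 binary branching process started from s individuals."""
    sizes = np.arange(s, s + max_events)                  # population before each branching
    waits = rng.exponential(1.0, size=(n_runs, max_events)) / sizes
    event_times = np.cumsum(waits, axis=1)                # time of the m-th branching
    n_events = (event_times <= t).sum(axis=1)             # branchings that occurred up to t
    return np.exp(-t) * (s + n_events)

rng = np.random.default_rng(3)
# Assumed illustration values: a candidate with seed s = 2 observed at t = 4.
U = scaled_population(s=2, t=4.0, n_runs=2000, max_events=2000, rng=rng)
print(U.mean(), U.var())   # a gamma distribution with shape 2 has mean 2 and variance 2
```

At finite $t$ the variance is slightly below its limiting value, and `max_events` must be chosen much larger than $s\, e^{t}$ so that truncation does not bias the count.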
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